Acoustic-phonetic labels in a Japanese speech database

Authors

  • Kazuya Takeda
  • Yoshinori Sagisaka
  • Shigeru Katagiri
Abstract

A large Japanese speech database at ATR (JSDB-ATR) is introduced. These speech data are transcribed in multiple ways using acoustic-phonetic symbols, to serve various data access requests and to make fine acoustic-phonetic analysis convenient. For the multiple transcriptions, three types of categories are considered: linguistic and phonemic categories, acoustic event categories, and some allophonic variation categories. To date, about 8500 words, each uttered by eight professional announcers, have been collected, with half of them acoustically-phonetically transcribed.

INTRODUCTION

Recently, the construction of speech databases has been undertaken in many languages to gain knowledge for speech recognition, perception and synthesis [1]-[3]. However, there are few Japanese speech databases (JSDB) large enough for various research purposes. In this paper, a large Japanese speech database that is being built at ATR (JSDB-ATR) is introduced, focusing on its multiple acoustic-phonetic transcriptions.
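As a rough illustration of what transcribing the same speech "in multiple ways" can look like in practice, the sketch below stores labels for one utterance in parallel layers and looks up which labels of one layer overlap a label of another. Everything in it is an assumption made for illustration: the layer names only paraphrase the three category types named in the abstract, and the time-aligned `Label` record, the symbols, the word `kata`, the speaker ID and the helper names are hypothetical, not the actual JSDB-ATR label format.

```python
from dataclasses import dataclass, field

# Hypothetical layer names, paraphrasing the three category types in the
# abstract; the real JSDB-ATR label inventory is not given in this text.
LAYERS = ("phonemic", "acoustic_event", "allophonic")


@dataclass(frozen=True)
class Label:
    layer: str       # one of LAYERS
    symbol: str      # illustrative symbol, not an actual JSDB-ATR code
    start_ms: float  # assumed time alignment (not stated in the abstract)
    end_ms: float


@dataclass
class Utterance:
    word: str
    speaker: str
    labels: list = field(default_factory=list)

    def add(self, layer, symbol, start_ms, end_ms):
        """Append one label after basic validation."""
        if layer not in LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        if end_ms <= start_ms:
            raise ValueError("label must have positive duration")
        self.labels.append(Label(layer, symbol, start_ms, end_ms))

    def query(self, layer, symbol=None):
        """Labels in one layer, optionally restricted to one symbol."""
        return [l for l in self.labels
                if l.layer == layer and (symbol is None or l.symbol == symbol)]

    def overlapping(self, label, layer):
        """Labels in `layer` whose time span overlaps that of `label`."""
        return [l for l in self.labels
                if l.layer == layer
                and l.start_ms < label.end_ms
                and label.start_ms < l.end_ms]


def build_example():
    """Build a made-up utterance with two layers of labels."""
    utt = Utterance(word="kata", speaker="spk01")
    utt.add("phonemic", "k", 0.0, 80.0)
    utt.add("phonemic", "a", 80.0, 200.0)
    utt.add("acoustic_event", "closure", 0.0, 55.0)
    utt.add("acoustic_event", "burst", 55.0, 80.0)
    utt.add("acoustic_event", "voiced", 80.0, 200.0)
    return utt


if __name__ == "__main__":
    utt = build_example()
    k = utt.query("phonemic", "k")[0]
    print([l.symbol for l in utt.overlapping(k, "acoustic_event")])
```

Keeping every layer in one flat list of labels, rather than nesting events under phonemes, lets each layer be queried on its own or cross-referenced against another, which is one plausible way to serve the "various data access requests" the abstract mentions.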

Similar resources

Developing a Chinese L2 speech database of Japanese learners with narrow-phonetic labels for computer assisted pronunciation training

For the purpose of developing Computer Assisted Pronunciation Training (CAPT) technology with more informative feedback, we propose to use a set of narrow-phonetic labels to annotate a Chinese L2 speech database of Japanese learners. The labels include the basic units of “Initials” and “Finals” for Chinese phonemes, and diacritics for erroneous articulation tendencies. Pilot investigations were made on t...

A linguistic and prosodic database for data-driven Japanese TTS synthesis

We propose a method to generate a database that contains a parametric representation of F0 contours associated with linguistic and acoustic information, to be used by data-driven Japanese text-to-speech (TTS) systems. The configuration of the database includes recorded speech, F0 contours and their parametric labels, phonetic transcription with durations, and other linguistic information such a...

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

Corpus of Spontaneous Japanese: Its Design and Evaluation

The Corpus of Spontaneous Japanese, or CSJ, is a large-scale database of spontaneous Japanese. It contains speech signals and transcriptions of about 7 million words, along with various annotations such as POS and phonetic labels. After a description of its design issues, a preliminary evaluation of the CSJ is presented. The results strongly suggest the usefulness of the CSJ as the resource for the study of spo...

Use of a Large-scale Spontaneous Speech Corpus in the Study of Linguistic Variation

The Corpus of Spontaneous Japanese, or CSJ, is a large-scale database of spontaneous Japanese. It contains speech signals and transcriptions of about 7 million words, along with various annotations such as POS and phonetic labels. After a description of its design issues, the potential of the CSJ as a resource for the study of linguistic variation is evaluated.

Publication date: 1987